Models for Approximating Basilar 
Membrane Displacement 

By J. L. FLANAGAN 

(Manuscript received April 1, 1960) 

Three analytical models are developed for estimating the displacement 
of the basilar membrane in the human ear when the sound pressure at the 
eardrum is known. Frequency-domain data, derived experimentally by 
Bekesy, are Fourier-transformed to examine the impulse response of the 
membrane. Time-domain and frequency-domain responses of the models 
are compared with the experimental data. Excitation of the models by 
periodic impulses is considered. Calculations of membrane displacement 
are made for excitation by positive pulses, and by alternately positive 
and negative pulses. Applicability of the results to the perception of 
pitch is indicated. 

I. INTRODUCTION 

In the course of developing an hypothesis to account for results ob- 
tained in two experiments on pitch perception, 1,2 it became desirable to 
have a tractable model from which the displacement of the basilar mem- 
brane at a given point could be estimated from a knowledge of the sound 
pressure at the eardrum. This report describes the results of an effort to 
deduce such a model. 

II. MECHANICAL PROPERTIES OF THE MIDDLE EAR AND COCHLEA 

To recall facts and establish a frame of reference, a simplified sketch 
of the peripheral mechanism of hearing is shown in Fig. 1. The cochlea, 
actually wound in a snail-shell-like spiral in man, is sketched here un- 
rolled and stretched out. It contains the perilymph fluid and is parti- 
tioned longitudinally by a duct formed by Reissner's membrane and the 
basilar membrane. The duct, roughly triangular in cross section, is filled 
with another fluid, endolymph. Resting upon the basilar membrane 
within the cochlea duct is the organ of Corti. This organ, immersed in 
the endolymph, serves as the termination of the auditory nerve. Bekesy 3 
has established that the basilar membrane and Reissner's membrane 
vibrate cophasically when the ear is stimulated by sound in the lower 
range of audible frequencies. Because Reissner's membrane does not 
enter into the present development, only the basilar membrane is 
sketched in the schematic diagram. 

THE BELL SYSTEM TECHNICAL JOURNAL, SEPTEMBER 1960 

[Fig. 1 labels: semicircular canals, oval window, scala vestibuli, 
basilar membrane, helicotrema, round window, scala tympani.] 

Fig. 1 — Schematic drawing of the human ear. 

A sound wave impinging on the ear is led down the external canal 
and sets the drum into vibration. The vibration is transmitted by the 
ossicular chain to the cochlea, where the piston-action of the stapes 
foot-plate produces a compressional wave in the fluid. Because of its 
distributed mass and elastic and viscous constants, and because of the 
pressure release at the round window, the basilar membrane vibrates 
selectively according to the frequency content of the stimulus. Displace- 
ment of the basilar membrane causes pressure to be exerted (by another 
membrane in the cochlea duct, the tectorial) upon the hairs emanating 
from hair cells in the organ of Corti. When the hairs are sufficiently de- 
formed, electrical discharges are produced in the nerve fibers. 

The mechanical properties of the cochlea have been studied in detail 
by Bekesy. 4 He found that, when the stapes is driven sinusoidally with 
constant amplitude of displacement, the amplitude of displacement of 
points along the low-frequency (or apical) end of the basilar membrane 
varies with frequency as shown in Fig. 2. The peak displacement of 
each point is normalized to unity. His measurements 3 of the difference 
in phase between the displacement of the stapes and the displacement of 
points along the membrane are sketched in Fig. 3. In addition to these 
data, Bekesy found 5 that, when the sound pressure is constant at the 
eardrum, the magnitude of volume displacement of the round window 
is nearly constant up to around 2000 cps. To the extent that the 
perilymph is incompressible and the walls of the cochlea rigid, the volume 
displacement of the round window is equal to that of the stapes footplate. 
Data reported by Zwislocki 6 and by Bekesy 5 indicate that, for 
frequencies below 1000 cps, the over-all impedance of the middle ear and 
cochlea is predominantly elastic, owing principally to the compliance of 
the middle ear air cavity, the round window membrane and the 
ligaments retaining the ossicles and drum. For these frequencies, therefore, 
the displacement of the stapes is essentially proportional to, and in 
phase with, the sound pressure at the eardrum. At frequencies above 1000 
cps, the inertial and viscous elements of the middle ear and cochlea 
become more important, and the velocity of the stapes apparently may 
lag the pressure at the drum in phase by as much as $\pi/2$ radians or more 
(hence, the stapes displacement may lag the pressure by as much as $\pi$ 
radians or more). For frequencies above about 1000 or 2000 cps, the 
indications are that amplitude of stapes displacement begins to decrease 
appreciably for constant pressure at the eardrum.* 

Fig. 2 — Relative amplitude of displacement as a function of frequency for 
different points along the basilar membrane. The stapes is driven with constant 
amplitude of displacement (after Bekesy 4 ). [Horizontal axis: frequency in 
cycles per second, 50 to 5000.] 

Fig. 3 — Relative amplitude and phase of basilar membrane displacement as 
a function of distance along the membrane (after Bekesy 3 ). 

Fig. 4 — Ratio of volume displacement of stapes to peak displacement of 
basilar membrane (after Bekesy 4 ). 

* Zwislocki's data suggest a decrease of the order of 12 to 18 db/octave; 
Bekesy's average data seem to agree roughly with this. In one preparation, however, 
Bekesy obtained a fall of about 30 db/octave. 

Because the physical dimensions and mechanical properties of the 
basilar membrane change along its length (for example, the membrane 
increases in width, thickness and compliance going toward the apical 
end), the volume displacement of the membrane per unit length, per 
unit pressure across it, changes with distance from the stapes. For a 
constant amplitude of stapes displacement, therefore, the amplitude of 
the maximally displaced point is not constant with frequency. Bekesy 4 
gives the ratio of amplitude of volume displacement of the stapes to 
amplitude of the maximally displaced point, as shown in Fig. 4. These 
data show that, for frequencies below 1000 cps, the amplitude of the 
maximum increases approximately 4 or 5 db/octave. At around 1000 cps 
the curve flattens off. 

In measurements of the absolute value of membrane displacement, 
Bekesy finds the maximal displacement at 200 cps to be $10^{-4}$ cm at the 
threshold of feeling (about 140 db referred to 0.0002 dyne/cm$^2$) and, 
through extrapolation, $10^{-11}$ cm at the threshold of hearing.* For a given 
frequency and a given point on the membrane, Bekesy's data indicate 
that the mechanical vibrations of the stapes and basilar membrane are 
essentially linearly related until sound pressures above the threshold of 
feeling are reached. There is evidence, however, that the ear is capable 
of producing perceptible subjective components at sound levels less than 
this value. 
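
The extrapolation quoted above can be reproduced with a short calculation. This is only a sketch: it assumes, as the text does, that displacement scales linearly with sound pressure over the whole range, and it takes a displacement of about 1e-4 cm at about 140 db as the reference. The function name is illustrative and does not come from the paper.

```python
def displacement_at_level(ref_displacement_cm, ref_level_db, level_db):
    """Scale a displacement linearly with sound pressure (a factor of 10 per 20 db)."""
    return ref_displacement_cm * 10.0 ** ((level_db - ref_level_db) / 20.0)

# Reference: about 1e-4 cm at the threshold of feeling (about 140 db).
at_threshold_of_hearing = displacement_at_level(1e-4, 140.0, 0.0)
print(f"{at_threshold_of_hearing:.1e} cm")
```

A drop of 140 db is a factor of ten million in pressure, so the linear assumption carries the displacement down by seven orders of magnitude.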

As stated at the outset, we desire an analytical relation for estimating 
the basilar membrane displacement at a given point from a knowledge 
of the sound pressure at the eardrum, valid at least in the frequency 
range below 1000 cps. It is in this range that the stapes displacement is 
in phase with, and proportional to, the pressure at the drum. The experi- 
mental data that the model must describe are the frequency-domain 
data just discussed. The approximation problem may, of course, be ap- 
proached in either the time or frequency domains; usually it is helpful 
to maintain some insight in both domains. Consequently, we would 
first like to inquire as to the form of the displacement response of a point 
toward the low-frequency end of the membrane to an impulse of pres- 
sure applied at the eardrum. 

III. INVERSE FOURIER TRANSFORMATION OF BEKESY'S DATA 

The phase data of Fig. 3 are at best meager, but they are most 
definitive for the 200-cps point. Let us, therefore, take the 200-cps point for 
a sample calculation. Deducing the phase response from Fig. 3,† and 
taking the amplitude response from Fig. 2, we may plot the data as 
shown in Fig. 5.‡ Let us make two assumptions about the system with 
which we are dealing: first, the impulse response, $h(t)$, of the point 
under consideration is Fourier transformable (i.e., 
$\int_{-\infty}^{\infty} h^2(t)\,dt < \infty$); 
and second, the system is a stable one having no complex poles with real 
parts equal to, or greater than, zero (i.e., the system exhibits no output 
until an input is applied, and the final value of the impulse response is 
zero). 

* The diameter of a hydrogen atom is about $10^{-8}$ cm. 

† Because peak displacement increases at around 5 db/octave, the possibility 
exists that the displacement of the point that responds maximally to a given 
frequency might not be the greatest displacement of the membrane for that 
frequency. However, the frequency response of a given point generally rises at a rate 
greater than 5 db/octave in the vicinity of its resonance; consequently, the 
greatest displacement occurs essentially at the maximally responding point. 

‡ As closely as I can determine from the Akustische Zeitschrift data, the 
maximum displacement of the "200-cps point" falls at about 210 to 220 cps. 

Fig. 5 — Displacement amplitude and phase for a point near the apical end of 
the basilar membrane. Maximum response occurs for a frequency of about 200 
cps. These curves are obtained from data in Figs. 2 and 3. 

Taking the data of Fig. 5 as the magnitude, $|H(\omega)|$, and phase, $\Phi(\omega)$, 
respectively, of the Fourier transform, $H(\omega)$, of the impulse response, 
$h(t)$, we wish to calculate the inverse transform: 

$$h(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} H(\omega)\, e^{j\omega t}\, d\omega. \tag{1}$$

In Cartesian form, $H(\omega)$ is 

$$H(\omega) = \operatorname{Re} H(\omega) + j \operatorname{Im} H(\omega), \tag{2}$$

where 

$$\operatorname{Re} H(\omega) = |H(\omega)| \cos \Phi(\omega), \qquad \operatorname{Im} H(\omega) = |H(\omega)| \sin \Phi(\omega).$$

Because $\operatorname{Re} H(\omega)$ is an even function of $\omega$ and $\operatorname{Im} H(\omega)$ an odd 
function, (1) reduces to: 

$$h(t) = \frac{1}{\pi} \int_{0}^{\infty} \operatorname{Re} H(\omega) \cos \omega t \, d\omega - \frac{1}{\pi} \int_{0}^{\infty} \operatorname{Im} H(\omega) \sin \omega t \, d\omega = h_1(t) + h_2(t), \tag{3}$$

where $h_1(t)$ is an even function of time and $h_2(t)$ an odd function. 
Because of the assumptions regarding stability [i.e., $h(t) = 0$ for $t < 0$]: 

$$h_1(t) = -h_2(t) \quad \text{for } t < 0, \qquad h_1(t) = h_2(t) \quad \text{for } t > 0. \tag{4}$$

Hence (3) can be written: 

$$h(t) = \frac{2}{\pi} \int_{0}^{\infty} \operatorname{Re} H(\omega) \cos \omega t \, d\omega \quad \text{for } t > 0. \tag{5}$$

To calculate $h(t)$, then, only $\operatorname{Re} H(\omega)$ is needed. For the data of Fig. 5, 
$\operatorname{Re} H(\omega)$ is plotted in Fig. 6.* 

Fig. 6 — Real part of the Fourier transform, $H(\omega)$, whose amplitude and phase 
spectra are given in Fig. 5. 

* $\operatorname{Re} H(\omega)$ was obtained from a large linear plot of $|H(\omega)|$ and $\Phi(\omega)$, not from 
a semilog plot such as Fig. 5. Estimates, where needed (such as end points of 
curves), were made on the linear plot. 

In the absence of an analytical specification of $\operatorname{Re} H(\omega)$, we have 
graphically evaluated the integral (5) by using the approximation: 

$$h(t_i) = \frac{2}{\pi} \sum_{n=0} \operatorname{Re} H(\omega_n) \cos(\omega_n t_i)\, \Delta\omega, \tag{6}$$

where: 

$$\omega_n = n\omega_0,$$
$$\omega_0 = (2\pi)(10) \text{ radians per second},$$
$$\Delta\omega = (2\pi)(10) \text{ radians per second},$$
$$t_i = (0.4 \times 10^{-3})\, i, \qquad i = 0, 1, 2, \ldots, 27.$$

The impulse response computed by the approximation (6) is shown in 
Fig. 7. 
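
The cosine-sum approximation can be tried on a system whose answer is known in closed form. The sketch below is not the paper's graphical procedure and does not use Bekesy's data: it assumes a test response h(t) = exp(-a t) for t >= 0, whose transform is 1/(a + jω), so that Re H(ω) = a/(a² + ω²). The function name, the value of `a`, and the sampling step are all illustrative choices.

```python
import math

def impulse_response_from_real_part(re_h_samples, d_omega, t):
    """Approximate h(t), t > 0, as (2/pi) * sum_n ReH(n*d_omega) * cos(n*d_omega*t) * d_omega."""
    total = 0.0
    for n, re_h in enumerate(re_h_samples):
        total += re_h * math.cos(n * d_omega * t)
    return (2.0 / math.pi) * total * d_omega

# Assumed test system (not from the paper): h(t) = exp(-a*t) for t >= 0.
a = 100.0          # decay rate, 1/sec
d_omega = 1.0      # frequency step, radians per second
samples = [a / (a * a + (n * d_omega) ** 2) for n in range(200_000)]

t = 0.005
approx = impulse_response_from_real_part(samples, d_omega, t)
exact = math.exp(-a * t)
print(approx, exact)
```

The plain sum counts the n = 0 sample at full weight, which leaves a small positive bias that shrinks as the frequency step is made finer. Truncating the sum at a finite frequency adds a further, smaller error.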

One notices that the graphical transform yields a nonzero value at 
t = 0, and suggests a nonzero response for t < 0. The reason for this 
might be one of several: (a) the phase and amplitude data of Fig. 5 may 
not be compatible to satisfy the assumptions made about the system; 
(b) the data of Fig. 5 suggest that the amplitude response may be band- 
limited, and it was so treated in the computation; (c) the quantization 
used in (6) may introduce an error in the calculation of $h(t)$. 

Of these three possibilities, the first two seem the more likely sources 
of discrepancy. The phase data in Fig. 3 suggest that at very low 
frequencies the phase difference between the displacements of the membrane 
and stapes is essentially zero. We know, however, that the scalas vestibuli 
and tympani communicate at the helicotrema. Consequently, a constant 
displacement of the stapes cannot sustain a constant displacement of 
the membrane. This argues, therefore, that the amplitude of membrane 
displacement must go to zero as zero frequency is approached, and the 
frequency-domain transform of displacement must have at least one 
zero at the origin of the complex frequency plane. If this is the case, and 
if the transform is minimum phase, the phase response near zero 
frequency must be at least $\pi/2$. Intuitively, too, it appears that constant 
displacement near the helicotrema requires constant velocity of the 
stapes, arguing again for a derivative relationship between displacements 
at low frequencies. It seems likely then, that, as low frequencies are 
approached, the phase of the membrane displacement begins to lead that 
of the stapes and at zero frequency goes to $\pi/2$. Measurement of the 
phase relations at low frequencies undoubtedly is difficult, owing to 
minuscule displacement of the membrane. 

Fig. 7 — Impulse response of the point on the basilar membrane characterized 
by the amplitude and phase data of Fig. 5. The inverse Fourier transform is 
obtained by graphical integration of the experimental frequency-domain data. 
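
The effect of a zero at the origin on low-frequency phase can be illustrated numerically. The sketch below uses a deliberately simple factor, (s + zero)/(s + pole), which is an assumption made for illustration and is not one of the membrane models. The pole position and the test frequency are arbitrary.

```python
import cmath
import math

def low_frequency_phase_deg(zero, pole, omega):
    """Phase in degrees of (j*omega + zero) / (j*omega + pole)."""
    s = 1j * omega
    return math.degrees(cmath.phase((s + zero) / (s + pole)))

pole = 1.0      # arbitrary left-half-plane pole at s = -1 (illustrative)
omega = 1e-3    # a frequency far below the pole frequency

phase_zero_at_origin = low_frequency_phase_deg(0.0, pole, omega)
phase_zero_off_origin = low_frequency_phase_deg(0.1, pole, omega)
print(phase_zero_at_origin, phase_zero_off_origin)
```

With the zero exactly at the origin the phase sits just under 90 degrees at low frequency. With the zero moved slightly into the left half-plane the phase returns toward zero as frequency falls.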

In connection with possibility (b), the amplitude data in Fig. 2 sug- 
gest that the membrane displacement is essentially band-limited and di- 
minishes to zero for frequencies below about 0.05 and above about 2.0 
times the resonant frequency. This should be interpreted, however, with 
an appreciation of the magnitudes of displacement being observed (on 
the order of 10~ cm) and the precision attaining thereto. In the graphical 
transformation, an effort was made to follow the experimental indications 
as exactly as possible. The amplitude function was treated as mathemati- 
cally band-limited and was considered to have zero value for frequencies 
above 400 cps and below 5 cps. This probably is not realistic for the 
physical system. 

Nevertheless, the inverse transform of the experimental data will pro- 
vide a helpful guide for appraising the responses of the models to be de- 
veloped in the next section. 

IV. MODELS FOR BASILAR MEMBRANE DISPLACEMENT 

A model for calculating the displacement of the basilar membrane at 
a given point must fit the frequency-domain data shown in Figs. 2 and 3. 
The response curves for various points along the membrane are not un- 
like those of bandpass filters having relatively sizable in-band delays. 
The peak values of the curves of Fig. 2 have been normalized to unity, 
but, as we recall from the previous discussion and from Fig. 4, the peak 
response rises at about 5 db/octave in the frequency range up to 1000 
cps. Above about 2000 cps, the peak response probably falls at something 
around 12 db/octave, and the stapes displacement is no longer in-phase 
with the pressure at the drum. 

If the data of Figs. 2 and 3 are normalized with respect to the 
frequency of the maximum response, the curves of Figs. 8 and 9 are 
obtained, respectively.* Except for the 150-cps case, the phase curves are 
estimated by reading points vertically from Fig. 3. The 150-cps curve 
is a single complete phase response published by Bekesy. 

* I have replotted these data as carefully as possible from the published curves 
of Bekesy. In reviewing the literature a small discrepancy appears between the 
amplitude curves published in Akustische Zeitschrift and those which appear later 
in the Handbook of Experimental Psychology. I judged this to be due to rounding 
and smoothing in redrafting the latter, and hence gave more weight to the earlier 
data. 

Fig. 8 — The experimental displacement data of Fig. 2 plotted with frequency 
normalized in respect to the frequency of maximum displacement. 

Fig. 9 — Phase responses deduced from data in Fig. 3. Frequency is normalized 
as in Fig. 8. 

One notices that, except for the 100-cps case, the amplitude curves fall 
close together and represent resonances whose bandwidths are essentially 
constant percentages of the resonant frequencies (i.e., constant "Q"). 
The 100-cps curve is slightly broader than the others. The lower skirt 
of the amplitude curves rises at about 6 db/octave, while the upper skirt 
falls at approximately 20 to 30 db/octave. The total phase change in 
passing through a resonance is of the order of $3\pi$. The phase curves for 
the lower frequency points have the greater slopes (i.e., $d\Phi/d\omega$) inside 
the passbands, and the delay for the lower frequency points is therefore 
greater. (This is, of course, as it should be, since the time required to 
propagate energy from the eardrum to points near the apical end of the 
membrane is greater than it is for points lying at the basal end.) 

As a minor digression, it is interesting to notice that the slopes of the 
phase curves in the vicinity of resonance indicate delay values about 
twice as large as the transit times measured by Bekesy. 4 Measuring the 
slopes of the phase curves in this region (again, from the linear plot) 
yields : 

Resonant Frequency, $f$     Phase Delay, $d\Phi/d\omega$     $2\pi f\,(d\Phi/d\omega)$ 

100 cps                   11.8 msec                  7.4 radians 
150                        7.2                       6.8 
200                        6.4                       8.0 
300                        4.5                       8.5 
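
The third column of the table is the product of the first two, which a few lines of code confirm. The list below simply restates the tabulated frequencies and delays; the variable names are illustrative.

```python
import math

# (resonant frequency in cps, delay near resonance in msec), as tabulated.
rows = [(100, 11.8), (150, 7.2), (200, 6.4), (300, 4.5)]

# Product 2*pi*f*(delay), in radians, for each row.
products = [2.0 * math.pi * f_cps * delay_msec * 1e-3 for f_cps, delay_msec in rows]

for (f_cps, delay_msec), product in zip(rows, products):
    print(f"{f_cps:4d} cps  {delay_msec:5.1f} msec  {product:4.1f} radians")
```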

These times represent the delays of the frequency components containing 
the greatest portion of the stimulus energy, and do not represent the 
times at which a response first appears (i.e., transit times). Looking 
back at the graphically determined impulse response for the 200-cps 
point (Fig. 7), one sees that the greatest displacement occurs at 
approximately 6.3 milliseconds. The time at which the response 
essentially begins is of the order of 2.5 milliseconds, which is in close 
agreement with Bekesy's measurements. It is also interesting to note in 
passing that 
the product of resonant frequency and delay near resonance (i.e., the 
third column) is roughly constant. This fact will be utilized in adjusting 
the phase response of the models. 

To return to the question of fitting a function to the frequency-domain 
data, at least for the frequency range below 1000 cps, let us consider a 
model whose Laplace transform is the ratio of rational polynomials. 
There will be, of course, an infinite number of possibilities for fitting the 
data, depending upon the criterion and precision of fit. We would, 
however, like to have an approximation that is both computationally simple 
and, one hopes, adequate to explain certain subjective results in 
pitch-matching. Any criterion of fit must ultimately have its roots in 
psychoacoustic phenomena. Since such cannot be specified at this time, it would 
seem that conventional curve-fitting techniques and least-squares 
criteria might be discarded in favor of a basically intuitive approach. 

The skirt slopes of the amplitude curves suggest a frequency function 
that has a simple zero in the vicinity of the origin of the complex fre- 
quency plane, and a denominator whose degree is about four or five 
greater than that of the numerator. The relationship between the real 
and imaginary parts of its complex conjugate poles ought to be such as 
to maintain the constant-percentage bandwidth character of the re- 
sponses. The amplitude at resonance ought to vary in the manner pre- 
scribed earlier, and the phase and delay characteristics presumably 
should be representative of the experimental data. (The question of 
phase at low frequencies will necessarily receive some further considera- 
tion.) 

As one of the simpler possibilities for approximating the amplitude 
and phase data, consider a function having two pairs of synchronously 
tuned complex-conjugate poles, one negative-real axis pole, and one 
negative-real axis zero near the origin. Adorned with necessary 
constants, such a function has a Laplace transform: 

$$F_1(s) = c_1 \beta^{4+r} \left( \frac{s + \epsilon}{s + \gamma} \right) \left[ \frac{1}{(s + \alpha)^2 + \beta^2} \right]^2 e^{-sT}, \tag{7}$$

where: 

$c_1$ is a positive real scale factor which yields the appropriate absolute 
value of displacement; 

$\beta^{4+r}$ is a factor that produces the proper variation in amplitude of 
resonance with resonant frequency (if, as previously suggested, a figure of 
5 db/octave rise in the resonant peak is accepted, then $r = 0.83$); 

$e^{-sT}$ is a delay factor ($T$ seconds) to bring the phase response into 
line with the experimental phase data. 

The function has second-order poles at $s = -\alpha \pm j\beta$, a simple pole 
at $s = -\gamma$ and a simple zero at $s = -\epsilon$. By virtue of the 
constant-percentage bandwidth properties of the membrane resonances, we let $\beta$ and 
$\alpha$ be related by a constant: $\beta = k\alpha$. The value of the function for real 
frequencies (i.e., $s = j\omega$) is: 

$$F_1(j\omega) = c_1 \beta^{4+r} \left( \frac{\epsilon + j\omega}{\gamma + j\omega} \right) \left[ \frac{1}{(\alpha^2 + \beta^2 - \omega^2) + j2\alpha\omega} \right]^2 e^{-j\omega T}. \tag{8}$$

As with the experimental data, it is convenient to work with frequency 
normalized. Let $\zeta = \omega/\beta$.* Then (8) becomes: 

$$F_1(j\zeta) = c_1 \beta^{r} \left( \frac{\epsilon/\beta + j\zeta}{\gamma/\beta + j\zeta} \right) \left[ \frac{1}{\left( 1 + \dfrac{1}{k^2} - \zeta^2 \right) + j\dfrac{2\zeta}{k}} \right]^2 e^{-j\zeta\beta T}. \tag{9}$$



One notices that fitting the phase and amplitude data of Bekesy near 
zero frequency presents somewhat of a dilemma (as it does with all 
other minimum-phase functions that we have considered). To diminish 
the amplitude response at low frequencies, one needs the zero of the 
function close to the origin. Although the phase at zero frequency 
obviously remains zero so long as the function zero is in the left-half plane, 
the phase "bulges" appreciably positive at low frequencies if the zero is 
placed too close to the origin. By empirical adjustment of the 
parameters, a compromise position was obtained for the zero, and corresponding 
values for $k$, $T$ and $\gamma$ were deduced. The values arrived at are: 

$$\frac{\epsilon}{\beta} = 0.1, \qquad k = 2.0, \qquad \frac{\gamma}{\beta} = 1.0, \qquad T = \frac{3\pi}{4\beta} \text{ seconds}. \tag{10}$$

In order to match phase responses, one notices that the delay, $T$, is 
taken to vary inversely with the resonant frequency, $\beta$. For the constant 
chosen, the added delay at 100 cps, for example, is approximately 4 
milliseconds. This delay, in conjunction with the $\omega$-dependent delay, is in 
reasonable agreement with Bekesy's measurements of transit time down 
the membrane. 
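
Two of the constants quoted for the model follow from one-line calculations. The sketch below assumes a peak rise of 5 db/octave for the exponent r and a delay constant of βT = 3π/4 for the added delay, which is the value the text uses for the models. The function name is illustrative.

```python
import math

# Exponent r such that the resonant peak rises 5 db per octave:
# 20*log10(2**r) = 5, so r = 5 / (20*log10(2)).
r = 5.0 / (20.0 * math.log10(2.0))

def added_delay_seconds(resonant_cps):
    """Delay T = 3*pi / (4*beta), with beta = 2*pi*resonant_cps in radians per second."""
    beta = 2.0 * math.pi * resonant_cps
    return 3.0 * math.pi / (4.0 * beta)

delay_100 = added_delay_seconds(100.0)
print(round(r, 2), delay_100 * 1e3)
```

The delay at 100 cps comes out at 3.75 msec, which the text rounds to approximately 4 milliseconds.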

A plot of 

$$\frac{|F_1(j\zeta)|}{|F_1(j\zeta_{\max})|},$$

where $\zeta_{\max}$ is the frequency of peak displacement, is given in Fig. 10.† 
The hatched region represents, for comparison, the variability among the 
200, 400 and 800 cps curves of Fig. 8. A plot of $\angle F_1(j\zeta) = \Phi_1(j\zeta)$ is 
given in Fig. 11. 

* This normalizes real frequency with respect to the imaginary part of the pole 
frequency. The latter is not necessarily the same as the frequency of maximum 
response. 

† Note that for the present parameters the resonant peak does not fall exactly 
at $\zeta = 1.0$, but more nearly at $\zeta = 0.95$. 

Fig. 10 — Frequency responses of the models compared with experimental data. 

If the experimental phase data at low frequencies are not taken too 
seriously, and the phase of (9) allowed to approach $\pi/2$, then the zero 
might be placed at the origin (i.e., $\epsilon = 0$). The amplitude response for 
this situation is shown by the dashed portion of the $|F_1(j\zeta)|$ curve in 
Fig. 10. 
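
The normalized response of the first model is easy to evaluate directly. The sketch below assumes the model has a zero at -ε, a real pole at -γ, a double pole pair at -α ± jβ and a pure delay T, with the parameter values ε/β = 0.1, γ/β = 1.0, k = 2.0 and βT = 3π/4. It returns the magnitude divided by the scale factor c1 β^r and an unwrapped phase. The function and constant names are illustrative.

```python
import math

EPS_OVER_BETA = 0.1            # zero position, epsilon / beta
GAMMA_OVER_BETA = 1.0          # real pole position, gamma / beta
K = 2.0                        # beta = K * alpha
BETA_T = 3.0 * math.pi / 4.0   # delay constant, beta * T

def f1_normalized(zeta):
    """Return (|F1| / (c1 * beta**r), phase in degrees) at zeta = omega / beta."""
    re_den = 1.0 + 1.0 / K**2 - zeta**2
    im_den = 2.0 * zeta / K
    magnitude = (math.hypot(EPS_OVER_BETA, zeta) / math.hypot(GAMMA_OVER_BETA, zeta)
                 / (re_den**2 + im_den**2))
    # Summing the factor angles keeps the phase continuous (no wrapping).
    phase = (math.atan2(zeta, EPS_OVER_BETA) - math.atan2(zeta, GAMMA_OVER_BETA)
             - 2.0 * math.atan2(im_den, re_den) - zeta * BETA_T)
    return magnitude, math.degrees(phase)

for zeta in (0.0, 1.0, 2.0, 3.0):
    mag, phase_deg = f1_normalized(zeta)
    print(f"zeta = {zeta:3.1f}   |F1| = {mag:5.2f}   phase = {phase_deg:7.1f} deg")
```

Under these assumptions the magnitude at ζ = 1 is about 0.67 with a phase near -248 degrees, and at ζ = 2 about 0.08 with a phase near -534 degrees.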

At high frequencies, function (9) attenuates as $\zeta^{-4}$, or at about 24 
db/octave. Some of Bekesy's data indicate attenuations slightly greater 
than this. As another possibility, therefore, a function having a simple 
zero at the origin and third-order, complex-conjugate poles was 
considered. Its Laplace transform is: 

$$F_2(s) = c_2 \beta^{5+r}\, s \left[ \frac{1}{(s + \alpha)^2 + \beta^2} \right]^3 e^{-sT}, \tag{11}$$

where the constants are defined in a manner similar to (7). The real 
frequency response in terms of normalized frequency is: 

$$F_2(j\zeta) = c_2 \beta^{r}\, j\zeta \left[ \frac{1}{\left( 1 + \dfrac{1}{k^2} - \zeta^2 \right) + j\dfrac{2\zeta}{k}} \right]^3 e^{-j\zeta\beta T}. \tag{12}$$

Fig. 11 — Phase responses of the models. 

A reasonable fit to the resonant bandwidth is obtained for $k = 2.0$ with 
$\beta T = 3\pi/4$, as before. For these values, a plot of 
$|F_2(j\zeta)| / |F_2(j\zeta_{\max})|$ 
is given in Fig. 10 and $\angle F_2(j\zeta)$ is given in Fig. 11. 
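
The high-frequency slopes of the two models can be compared numerically. The sketch below assumes the normalized forms described in the text: a zero, a real pole and a double pole pair for the first model, and a zero at the origin with a triple pole pair for the second. The delay factor is omitted because it does not affect magnitude. Function names and the test frequency are illustrative.

```python
import math

K = 2.0  # beta = K * alpha, the value used for both models

def resonator(zeta):
    """Normalized second-order factor 1 / ((1 + 1/K^2 - zeta^2) + j*2*zeta/K)."""
    return 1.0 / complex(1.0 + 1.0 / K**2 - zeta**2, 2.0 * zeta / K)

def f1_mag(zeta, eps=0.1, gamma=1.0):
    """Magnitude of the first model, without scale factor or delay."""
    return abs(complex(eps, zeta) / complex(gamma, zeta) * resonator(zeta) ** 2)

def f2_mag(zeta):
    """Magnitude of the second model, without scale factor or delay."""
    return abs(1j * zeta * resonator(zeta) ** 3)

def db_per_octave(mag, zeta):
    """Attenuation in db between zeta and 2*zeta."""
    return 20.0 * math.log10(mag(zeta) / mag(2.0 * zeta))

slope_1 = db_per_octave(f1_mag, 100.0)
slope_2 = db_per_octave(f2_mag, 100.0)
print(slope_1, slope_2)
```

Far above resonance the first model falls at roughly 24 db/octave and the second at roughly 30 db/octave, which is the motivation given for considering the second function.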

With a thought toward inverse transformations for the approximating 
functions, one function that provides a respectable fit and has a 
particularly simple inverse transform is the following: 

$$F_3(s) = c_3 \beta^{4+r}\, \frac{s^2 + 2\alpha s + \left( \alpha^2 - \dfrac{\beta^2}{3} \right)}{\left[ (s + \alpha)^2 + \beta^2 \right]^3}\, e^{-sT}. \tag{13}$$

Or, in terms of the normalized real frequency, 

$$F_3(j\zeta) = c_3 \beta^{r}\, \frac{\left( \dfrac{1}{k^2} - \dfrac{1}{3} - \zeta^2 \right) + j\dfrac{2\zeta}{k}}{\left[ \left( 1 + \dfrac{1}{k^2} - \zeta^2 \right) + j\dfrac{2\zeta}{k} \right]^3}\, e^{-j\zeta\beta T}. \tag{14}$$

This function has simple zeros at $s = \alpha(-1 \pm k/\sqrt{3})$ and 
third-order poles at $s = \alpha(-1 \pm jk)$. The function obviously becomes 
non-minimum phase for $k > \sqrt{3}$. Because the separation between zeros is 
$2\alpha k/\sqrt{3}$, the zero at $s = \alpha(-1 + k/\sqrt{3})$ has the greatest influence on 
amplitude response for the minimum phase conditions (i.e., $k < \sqrt{3}$). 
For values of $k = 1.7$ and $\beta T = 3\pi/4$, the amplitude and phase responses 
of (14) are shown in Figs. 10 and 11, respectively. 
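
The zero positions and the minimum-phase condition can be checked in a few lines. The sketch below assumes the numerator of the third model is s² + 2αs + (α² - β²/3) with β = kα, and it sets α = 1 for convenience. The function names are illustrative.

```python
import math

def f3_zeros(alpha, k):
    """Zeros of s^2 + 2*alpha*s + (alpha^2 - beta^2/3), with beta = k*alpha."""
    offset = k * alpha / math.sqrt(3.0)
    return (-alpha + offset, -alpha - offset)

def is_minimum_phase(k):
    """True when both zeros lie in the left half-plane, i.e. k < sqrt(3)."""
    return max(f3_zeros(1.0, k)) < 0.0

print(f3_zeros(1.0, 1.7))
print(is_minimum_phase(1.7), is_minimum_phase(2.0))
```

For k = 1.7 the upper zero lies only slightly to the left of the origin, which is why it dominates the low-frequency amplitude response.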
V. INVERSE TRANSFORMS OF THE MODELS 

It is pertinent to examine the inverse transforms of the models (7), 
(11) and (13) (i.e., their responses to unit impulses applied at $t = 0$) 
and to compare these responses with the impulse response obtained for 
the experimental data (Fig. 7). 

Inverse transforming (7) is a particularly cumbersome procedure. In 
the interest of conciseness, the details of the inverse transformations for 
all the functions are relegated to the Appendix. Only the results will be 
used here. For function $F_1(s)$, the impulse response turns out to be: 

$$f_1(t) = c_1 \beta^{1+r} \Big\{ [0.033 + 0.360\,\beta(t - T)]\, e^{-\beta(t-T)/2} \sin \beta(t - T) + [0.575 - 0.320\,\beta(t - T)]\, e^{-\beta(t-T)/2} \cos \beta(t - T) - 0.575\, e^{-\beta(t-T)} \Big\} \quad \text{for } t \ge T, \tag{15}$$

$$f_1(t) = 0 \quad \text{for } t < T,$$

where $T$ is the previously specified delay. 

In a similar manner, the inverse transform of $F_2(s)$ is: 

$$f_2(t) = c_2 \beta^{1+r} \Big\{ \Big[ \tfrac{1}{16} [\beta(t - T)]^2 + \tfrac{1}{8} \beta(t - T) - \tfrac{3}{16} \Big] e^{-\beta(t-T)/2} \sin \beta(t - T) + \Big[ -\tfrac{1}{8} [\beta(t - T)]^2 + \tfrac{3}{16} \beta(t - T) \Big] e^{-\beta(t-T)/2} \cos \beta(t - T) \Big\} \quad \text{for } t \ge T, \tag{16}$$

$$f_2(t) = 0 \quad \text{for } t < T.$$

As indicated earlier, the inverse transform of $F_3(s)$ is particularly 
simple, this being the principal reason for presenting its fit. Its inverse is: 

$$f_3(t) = \frac{c_3 \beta^{1+r}}{6}\, [\beta(t - T)]^2\, e^{-\beta(t-T)/1.7} \sin \beta(t - T) \quad \text{for } t \ge T, \tag{17}$$

$$f_3(t) = 0 \quad \text{for } t < T.$$
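
A numerical cross-check of the three transform pairs is sketched below. It assumes β = 1, T = 0 and unit scale factors, takes k = 2 with ε/β = 0.1 and γ/β = 1 for the first two models and k = 1.7 for the third, and compares a numerically integrated Laplace transform of each impulse response with the corresponding rational function at s = 1. All names, the integration limits and the step size are illustrative assumptions.

```python
import math

def f1(x):
    """Model 1 impulse response for beta = 1, T = 0, c1 = 1."""
    decay = math.exp(-x / 2.0)
    return ((0.033 + 0.360 * x) * decay * math.sin(x)
            + (0.575 - 0.320 * x) * decay * math.cos(x)
            - 0.575 * math.exp(-x))

def f2(x):
    """Model 2 impulse response for beta = 1, T = 0, c2 = 1 (k = 2)."""
    decay = math.exp(-x / 2.0)
    return ((x * x / 16.0 + x / 8.0 - 3.0 / 16.0) * decay * math.sin(x)
            + (-x * x / 8.0 + 3.0 * x / 16.0) * decay * math.cos(x))

def f3(x, k=1.7):
    """Model 3 impulse response for beta = 1, T = 0, c3 = 1."""
    return x * x * math.exp(-x / k) * math.sin(x) / 6.0

def big_f1(s):
    """Rational part of model 1 for beta = 1."""
    return (s + 0.1) / (s + 1.0) / ((s + 0.5) ** 2 + 1.0) ** 2

def big_f2(s):
    """Rational part of model 2 for beta = 1."""
    return s / ((s + 0.5) ** 2 + 1.0) ** 3

def big_f3(s, k=1.7):
    """Rational part of model 3 for beta = 1."""
    a = 1.0 / k
    return ((s + a) ** 2 - 1.0 / 3.0) / ((s + a) ** 2 + 1.0) ** 3

def laplace_numeric(f, s, t_max=80.0, step=0.002):
    """Trapezoidal estimate of the integral of f(t)*exp(-s*t) from 0 to t_max."""
    n = int(round(t_max / step))
    total = 0.5 * (f(0.0) + f(t_max) * math.exp(-s * t_max))
    for i in range(1, n):
        t = i * step
        total += f(t) * math.exp(-s * t)
    return total * step

for f, big_f in ((f1, big_f1), (f2, big_f2), (f3, big_f3)):
    print(f.__name__, laplace_numeric(f, 1.0), big_f(1.0))
```

The first pair agrees only to about three decimal places because the coefficients of the first impulse response are rounded to three figures. The other two pairs use exact fractions and agree much more closely.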

For comparison purposes, the impulse responses $f_1(t)$, $f_2(t)$ and $f_3(t)$ 
are plotted in Fig. 12, together with the graphically determined 
response of Fig. 7. In this plot relative delays have been equalized to 
compare waveforms. Because the scale constants $c_1$, $c_2$ and $c_3$ have not been 
taken into account, the amplitude scales for the different curves are 
relative. The curves have been plotted, however, for approximately equal 
peak-to-peak values. The fits to the experimental data do not seem 
unrealistic, in view of the questions raised earlier. One notices that, in most 
instances, the positive impulses produce the greatest deflection in the 
negative direction. Equalization of the delays to bring the curves into 
coincidence was such as to make the absolute origins ($\beta t = 0$) for each 
response the following number of radians to the left: 

Function                   Radians to Absolute Origins 

200 cps, experimental      2.3 
$f_1(t)$                   1.9 
$f_2(t)$                   2.4 
$f_3(t)$                   1.5 

Fig. 12 — Impulse responses of the models. These displacement functions are 
the inverse transforms of the frequency-domain data in Figs. 10 and 11. Time delay 
has been equalized to compare waveforms. Locations of absolute origins are given 
in the text. 

Of the functions displayed, $f_2(t)$ and $f_3(t)$ appear to fit the graphically 
derived impulse response better than $f_1(t)$ does. In the frequency domain, 
however, $F_1(s)$ appears to afford the slightly better fit. 



VI. RESPONSE OF MODELS TO PERIODIC IMPULSE EXCITATION 

If an excitation of periodic unit impulses is delivered to a linear 
system, the periodic response is a doubly infinite, linear superposition of 
responses to single impulses, or: 

$$g(t) = \sum_{n=-\infty}^{\infty} f(t - n\tau), \tag{18}$$

where $f(t)$ is the response to a single impulse, applied at $t = 0$, $\tau$ is the 
period of excitation and $g(t)$ is the periodic response. If $F(\omega)$ is the 
Fourier transform of $f(t)$, it can be shown that: 

$$g(t) = \sum_{n=-\infty}^{\infty} \frac{1}{\tau}\, F(n\omega_0)\, e^{jn\omega_0 t}, \tag{19}$$

where $\omega_0 = 2\pi/\tau$ is the fundamental frequency of excitation. Because 
$g(t)$ is a real function of time for a physically realizable system, the 
amplitude spectrum is even; i.e., $|F(\omega)| = |F(-\omega)|$; and the phase 
spectrum is odd; i.e., $\Phi(\omega) = -\Phi(-\omega)$. Relation (19) can therefore be 
written: 

$$g(t) = \frac{\omega_0}{2\pi} \left\{ |F(0)| + 2 \sum_{n=1}^{\infty} |F(n\omega_0)| \cos[n\omega_0 t + \Phi(n\omega_0)] \right\}. \tag{20}$$
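
The equivalence between superposed impulse responses and the harmonic series can be checked numerically. The sketch below does not use any of the membrane models: it assumes a simple damped sinusoid as the single-impulse response, because its Fourier transform is known in closed form. The constants, function names and number of terms are illustrative.

```python
import cmath
import math

A = 50.0                    # assumed damping, 1/sec
B = 2.0 * math.pi * 200.0   # assumed ringing frequency, radians per second

def f(t):
    """Illustrative single-impulse response: damped sinusoid starting at t = 0."""
    return math.exp(-A * t) * math.sin(B * t) if t >= 0.0 else 0.0

def big_f(omega):
    """Fourier transform of f(t): B / ((A + j*omega)^2 + B^2)."""
    return B / ((A + 1j * omega) ** 2 + B * B)

def periodic_by_superposition(t, tau, n_past=80):
    """Sum of the responses to impulses at 0, -tau, -2*tau, ... for 0 <= t < tau."""
    return sum(f(t + n * tau) for n in range(n_past))

def periodic_by_harmonics(t, tau, n_harmonics=5000):
    """(1/tau) * [F(0) + 2 * sum_n Re(F(n*w0) * exp(j*n*w0*t))]."""
    omega0 = 2.0 * math.pi / tau
    total = big_f(0.0).real
    for n in range(1, n_harmonics + 1):
        total += 2.0 * (big_f(n * omega0) * cmath.exp(1j * n * omega0 * t)).real
    return total / tau

tau = 0.01
for t in (0.001, 0.003, 0.007):
    print(t, periodic_by_superposition(t, tau), periodic_by_harmonics(t, tau))
```

Each term 2 Re[F e^{jnω0 t}] equals 2 |F| cos(nω0 t + Φ), so the second function is the cosine series written with complex arithmetic.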

By way of example, let us look at the response of function $F_1(\omega)$ [see 
(8)] to an excitation of periodic impulses. Suppose we first take the case 
where $F_1(\omega)$ specifies a point on the membrane tuned to the fundamental 
frequency of excitation. Let the resonant frequency of the point be 
$\beta_a = \omega_0$. Then $\zeta = \omega/\beta_a = n\omega_0/\omega_0 = n$ and 
$F_1(n\omega_0) = F_1(\zeta = n)$, and the periodic response is: 

$$g_a(t) = \frac{\beta_a}{2\pi} \left\{ |F_1(\zeta = 0)| + 2 \sum_{n=1}^{\infty} |F_1(\zeta = n)| \cos[n\beta_a t + \Phi_1(\zeta = n)] \right\}. \tag{21}$$

As determined in previous calculations, values of $F_1(\zeta)$ are: 

$n$     $\zeta$     $|F_1(\zeta)| / c_1\beta^r$     $\Phi_1(\zeta)$, degrees 

0       0           0.06                            0 
1       1           0.67                            -248 
2       2           0.08                            -534 
3       3           0.01                            -706 

Obviously, in this case the displacement response of the membrane is 
principally fundamental, the second harmonic being slightly more than 
one-tenth the amplitude of the fundamental. A plot, on a relative 
amplitude scale, of these first four terms is shown in Fig. 13(a). 

Fig. 13 — Displacement responses of model $F_1(s)$ to excitation by periodic 
impulses. The three conditions represent: (a) the displacement of a point on the 
membrane resonant to the fundamental frequency, $\omega_0$; (b) the displacement of a 
point resonant to the second harmonic; (c) the same as (b) except with the 
fundamental frequency component eliminated from the stimulus. 

Consider next a point on the membrane tuned to the second harmonic 
of the stimulus (i.e., $\beta_b = 2\omega_0 = 2\beta_a$). Then 
$\zeta = \omega/2\omega_0 = n\omega_0/2\omega_0 = n/2$ 
and $F_1(n\omega_0) = F_1(\zeta = n/2)$. In this case: 

$$g_b(t) = \frac{\beta_b}{4\pi} \left\{ |F_1(\zeta = 0)| + 2 \sum_{n=1}^{\infty} \left| F_1\!\left( \zeta = \frac{n}{2} \right) \right| \cos\!\left[ \frac{n\beta_b}{2}\, t + \Phi_1\!\left( \zeta = \frac{n}{2} \right) \right] \right\}. \tag{22}$$

Fig. 14 — Displacement responses of model $F_2(s)$ to periodic excitation by 
alternate positive and negative impulses. The five conditions represent the 
displacements of membrane points respectively resonant to: (a) fundamental 
frequency; (b) second harmonic; (c) third harmonic; (d) fourth harmonic and (e) 
fifth harmonic. The dashed curves are the displacements when the fundamental 
component is eliminated from the stimulus. 

Functional values for this case from previous computations are: 

$n$     $\zeta = n/2$     $\Phi_1(\zeta)$, degrees 

0       0                 0 
1       0.5               -69 
2       1.0               -248 
3       1.5               -422 
4       2.0               -534 
5       2.5               -626 

Because of the form of (9), note that the amplitude scale factors for 
$g_b(t)$ and $g_a(t)$ are in the ratio $(\beta_b/\beta_a)^r = 2^r$.* The response $g_b(t)$ of the 
point resonant at the second harmonic of the excitation is plotted in 
Fig. 13(b). 

If the stimulus is ideally high-pass filtered to remove the dc and
fundamental terms, then the periodic response at point $\beta_y$ is that shown in
Fig. 13(c).

The shape of a single period at $\beta_y$, with the fundamental present, is
already similar to the impulse response. If one examines points tuned 
higher in frequency, the time resolution increases because the bandwidth 
increases, and the response becomes more and more identifiable as re- 
peated impulse responses (i.e., nonoverlapping impulse responses). 

An even more instructive insight is obtained if one considers periodic
excitation by alternately positive and negative impulses. Such a train is
odd-harmonic in equal amplitude† and, like the repeated positive pulses,
has a phase spectrum that is zero. To vary the example, let us consider
the response of $F_2(s)$ [see (11)] to this excitation. Following an approach
identical to that just described, but dealing only with odd spectral
components, the responses of Fig. 14 are obtained. Once again we recall that
the amplitude scales, shown here as relative, are in the ratio $\beta^r$.
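
The spectral claim for the alternating train can be checked with a small discrete sketch. It is an illustration under stated assumptions rather than part of the original analysis: one fundamental period is represented by a hypothetical 16 samples, unit impulses stand in for the ideal pulses, and the function names are invented here.

```python
import cmath


def dft(samples):
    """Plain discrete Fourier transform of one period of samples."""
    total = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * n / total)
            for n, x in enumerate(samples))
        for k in range(total)
    ]


def one_period(num_samples, alternating):
    """One fundamental period: a positive unit impulse at the start and,
    for the alternating train, a negative unit impulse half a period later."""
    period = [0.0] * num_samples
    period[0] = 1.0
    if alternating:
        period[num_samples // 2] = -1.0
    return period


def line_amplitudes(num_samples=16, alternating=False, num_lines=6):
    """Magnitudes of spectral lines 0 .. num_lines-1 (dc first)."""
    spectrum = dft(one_period(num_samples, alternating))
    return [round(abs(spectrum[k]), 9) for k in range(num_lines)]


if __name__ == "__main__":
    print("positive pulses:   ", line_amplitudes(alternating=False))
    print("alternating pulses:", line_amplitudes(alternating=True))
```

For the same fundamental period, the positive train has equal lines at every harmonic, while the alternating train has lines only at odd harmonics, each twice as large.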

The response of a point tuned to the fundamental is essentially a 
sinusoid at the fundamental frequency and is shown in Fig. 14(a). The 
displacement of the membrane point tuned to the second harmonic 
(where there is no stimulus energy) is shown in Fig. 14(b). It exhibits a 
displacement in which the fundamental periodicity can be discerned 
when the fundamental component is present. Without the fundamental 
the response is relatively low-amplitude third harmonic. The point tuned 
to the third harmonic, Fig. 14(c), displaces essentially at the third
harmonic frequency whether the fundamental is present or absent. The
point tuned to the fourth harmonic, Fig. 14(d), begins to exhibit
fundamental periodicity again, regardless of whether the fundamental is
present or not. The point tuned to the fifth harmonic, Fig. 14(e), yields
a response which is very nearly nonoverlapping, superposed impulse
responses.

* The implication here, of course, is that we are still dealing with frequency
ranges below 1000 cps, where the membrane resonances are assumed to vary in
peak displacement, as previously discussed.

† The equal-amplitude spectral lines have twice the amplitude of those for
repeated positive impulses.

Quantification of the membrane displacement in this manner offers a 
basis for a number of useful speculations on the perception of periodic 
pulses. 

VII. CONCERNING RELATIVE AMPLITUDES OF DISPLACEMENT 

Since relative amplitude of displacement may be of importance in the 
conversion of membrane displacement into nervous activity, it is worth- 
while to examine amplitude relations further. We have seen that, if the 
membrane is excited with periodic impulses at a fundamental frequency 
to which a point near the apical (low-frequency) end is resonant, this 
point executes a displacement which is nearly the fundamental sinusoid. 
A point toward the basal (high-frequency) end, whose resonance curve 
embraces a substantial number of harmonics, yields a periodic response, 
which is essentially a succession of negligibly-overlapping impulse re- 
sponses. Because such points respond simultaneously (except for transit 
delay), and because their peak amplitudes have implications in hy- 
potheses about pitch perception, let us compare the peak amplitudes of 
a "fundamental-responding" point with that of an "impulse-responding" 
point. For the sake of varying the examples further, let us work with 
model $F_3(s)$, in (13), and its impulse response $f_3(t)$, in (17). We are
interested in the absolute extremum of (17). The times of the extrema can
be found by differentiating (17), setting to zero and solving, which gives:

The envelope maximum occurs at : 

'max envel — 



.(" + A (24) 



It is not necessary to solve the transcendental relation (23), since we
already have (17) plotted to a reasonable precision in Fig. 12. Using the
latter data, we get for the absolute maximum of $f_3(t)$,

$$ |f_3(t)|_{\max} = \frac{c_3\beta_p^{1+r}}{6}(1.4) = (0.23)\,c_3\beta_p^{1+r}, \qquad (25) $$
where the subscript p denotes a point toward the high-frequency end of
the membrane. In a parallel manner, the amplitude of a point, q, tuned
to the fundamental frequency can be obtained from relation (20). In
this case, $\beta_q = \omega_0$ and

$$ |g_q(t)|_{\text{fund}} \approx \frac{\omega_0}{2\pi}\,2|F_3(\zeta = 1)| = \frac{\omega_0}{\pi}\,c_3\beta_q^{r}(0.83) = c_3\beta_q^{1+r}(0.26). \qquad (26) $$

The ratio of these two peak displacements is, therefore

$$ R_3 = \frac{|f_3(t)|_{\max}}{|g_q(t)|_{\text{fund}}} = (0.88)\left(\frac{\beta_p}{\beta_q}\right)^{1+r}. \qquad (27) $$

If the same computations are made for the other two models, $F_1(s)$
and $F_2(s)$, the ratios are:

$$ R_1 = (0.80)\left(\frac{\beta_p}{\beta_q}\right)^{1+r}, \qquad R_2 = (0.82)\left(\frac{\beta_p}{\beta_q}\right)^{1+r}. \qquad (28) $$

Since $\beta_p > \beta_q$ and since the experimentally determined exponent
$r \approx 0.8$, the peak amplitudes of the impulse-responding points exceed
those of the fundamental-responding points, at least in the frequency
range below 1000 cps (i.e., roughly over the apical half of the
membrane).
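
The size of the effect can be illustrated with a short sketch of the ratio relations (27) and (28). The coefficients 0.88, 0.80 and 0.82 and the exponent r = 0.8 are those quoted in the text; the function names and the example frequency ratio are assumptions made only for illustration.

```python
# Exponent r as quoted in the text.
R_EXPONENT = 0.8

# Leading coefficients of the peak-displacement ratios for the three models.
RATIO_COEFFICIENTS = {"F1": 0.80, "F2": 0.82, "F3": 0.88}


def peak_ratio(model, beta_p, beta_q, r=R_EXPONENT):
    """Ratio of impulse-responding to fundamental-responding peak displacement."""
    return RATIO_COEFFICIENTS[model] * (beta_p / beta_q) ** (1.0 + r)


def break_even_frequency_ratio(model, r=R_EXPONENT):
    """Value of beta_p/beta_q at which the two peak displacements are equal."""
    return (1.0 / RATIO_COEFFICIENTS[model]) ** (1.0 / (1.0 + r))


if __name__ == "__main__":
    for model in ("F1", "F2", "F3"):
        print(model,
              "ratio at beta_p/beta_q = 5:", peak_ratio(model, 5.0, 1.0),
              "break-even beta_p/beta_q:", break_even_frequency_ratio(model))
```

Under these relations the impulse-responding point already has the larger peak once its resonant frequency is more than some 10 to 15 per cent above the fundamental, and for a point tuned five times higher the ratio for $F_3$ is about 16.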

VIII. EVALUATION OF SCALE CONSTANTS $c_1$, $c_2$ AND $c_3$

Bekesy's data show that the maximum deflection of the basilar
membrane at a frequency of 1000 cps and a sound pressure level of 134 db
referred to 0.0002 dyne/cm$^2$ (i.e., $10^3$ dynes/cm$^2$) is approximately $10^{-4}$
cm. His measurements also indicate that the mechanical functioning of
the middle and inner ear is essentially linear to the threshold of feeling.
In the models, therefore, the constants $c_1$, $c_2$ and $c_3$ should be chosen to
provide peak displacements at resonance equal to

$$ (10^{-7}\ \text{cm}^3/\text{dyne})\left[\frac{\beta}{2\pi(1000)}\right]^{r}. $$

The amplitude responses of the models at resonance are:

$$ |F_1(\zeta = 1.0)| = c_1\beta^{r}(0.66), $$
$$ |F_2(\zeta = 1.0)| = c_2\beta^{r}(0.92), \qquad (29) $$
$$ |F_3(\zeta = 1.0)| = c_3\beta^{r}(0.83). $$
The values of the constants, therefore, should be:

$$ c_1 = \frac{10^{-7}}{(0.66)[2\pi(1000)]^{r}}, $$
$$ c_2 = \frac{10^{-7}}{(0.92)[2\pi(1000)]^{r}}, \qquad (30) $$
$$ c_3 = \frac{10^{-7}}{(0.83)[2\pi(1000)]^{r}}. $$
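
The arithmetic behind the scale constants can be sketched as follows. This is a minimal illustration, not the original computation: it takes r = 0.8 from the earlier discussion, uses the resonance factors of (29) with the factor for $F_1$ taken here as 0.66 (an assumption), and the function names are hypothetical.

```python
import math

R_EXPONENT = 0.8              # exponent r quoted in the text
REFERENCE_PRESSURE = 0.0002   # dyne/cm^2, reference for sound pressure level

# Relative amplitude of each model at resonance; the 0.66 entry is an
# assumption of this sketch.
RESONANCE_FACTORS = {"c1": 0.66, "c2": 0.92, "c3": 0.83}


def pressure_from_spl(spl_db):
    """Sound pressure in dyne/cm^2 for a level in db re 0.0002 dyne/cm^2."""
    return REFERENCE_PRESSURE * 10.0 ** (spl_db / 20.0)


def scale_constant(name, r=R_EXPONENT):
    """Scale constant giving 1e-7 cm^3/dyne peak displacement at 1000 cps."""
    return 1.0e-7 / (RESONANCE_FACTORS[name] * (2.0 * math.pi * 1000.0) ** r)


def peak_displacement_per_unit_pressure(name, beta, r=R_EXPONENT):
    """Resonant peak displacement (cm^3/dyne) for a point tuned to beta rad/sec."""
    return scale_constant(name, r) * beta ** r * RESONANCE_FACTORS[name]


if __name__ == "__main__":
    print("134 db corresponds to", pressure_from_spl(134.0), "dyne/cm^2")
    for name in ("c1", "c2", "c3"):
        print(name, "=", scale_constant(name))
```

With these constants each model returns $10^{-7}$ cm$^3$/dyne at $\beta = 2\pi(1000)$, which at $10^3$ dynes/cm$^2$ is the $10^{-4}$ cm deflection quoted from Bekesy.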



IX. APPLICATION TO PITCH PERCEPTION 

As suggested at the outset, the present computations were precipitated
by a particular need. In drafting a paper to report two earlier
experiments on pitch perception,1,2 it became painfully obvious, as soon
as the discussion section was reached, that little quantitative basis
existed for interpreting the subjective data. The models described here
were developed in an attempt to alleviate this situation.

In the pitch experiments it became necessary to explain how three 
different modes of pitch perception arise when periodic pulse trains 
stimulate the ear. One mode ascribes a pitch to the stimulus equal to 
the pulse rate, regardless of the polarity pattern of the train; in other 
words, positive pulses (condensations) are not discriminated from
negative pulses (rarefactions). A second mode ascribes a pitch equal to the
mathematical fundamental whether or not energy is present at this frequency;
this mode includes the situation which has been labeled the "residue"
phenomenon. The third mode assigns a pitch equal to the frequency of the
lowest spectral component present in the stimulus.

The first mode characteristically operates at low values of pulse rate 
(usually below 100 pps in unmasked situations). The second usually 
obtains for fundamental frequencies in the approximate range 200 to 
500 cps. The third seems to hold for fundamental frequencies around 1000 
cps and higher when the lowest-frequency component is rejected by 
HP filtering. 

Without launching into the details of the psychophysical experiments, 
the applicability of the models to the perception of pulses can at least 
be indicated. It is of consequence, for example, to ascertain to what
extent the subjective pitch modes are manifested in the mechanical 
operation of the cochlea. Looking again at Fig. 14, one can observe dis- 
placement patterns that might be considered favorable for giving rise 
to the pitch modes just outlined. This presumes, of course, certain hy- 
potheses about the mechanism of converting displacement information 
into electrical discharges in the nerve fiber. A discussion of these im- 
portant details, however, is more appropriate in another place. Even so, 
Fig. 14 suggests several things. 

When the membrane is excited over most of its length by a periodic 
pulse stimulus, the higher-frequency portion probably is effective in 
supplying only pulse-rate information, no matter what the polarity 
pattern of the train. In this region of the membrane the pulses are well 
resolved in time (i.e., the displacement is essentially nonoverlapping 
impulse responses), and the "overshoot" of the response to each pulse 
is substantial. Under certain assumptions about the transduction of dis- 
placement into nervous activity, the latter fact can be construed as 
favorable for eliciting nerve volleys in synchrony with each pulse.* 

Information on fundamental frequency might be manifested in two 
ways: (a) If the fundamental component is present in the stimulus, then 
the point on the membrane tuned to the fundamental responds strongly 
with near sinusoidal displacement, (b) If, on the other hand, the funda- 
mental is absent, the lowest-frequency part of the membrane receiving 
excitation will embrace a small number of spectral lines within its fre- 
quency response. Its displacement generally will exhibit the fundamental 
periodicity in a form favorable for triggering one nerve volley per funda- 
mental period. 

So far these comments have not considered the importance of relative 
amplitudes of displacement. This question appears to be of particular 
consequence in evoking the second, or fundamental, pitch mode. Although 
the indications are that most significant neural information originates 
from the point of greatest displacement, there is evidence that subjects 
may give preference to the fundamental mode over the pulse-rate mode 
even though the former may be correlated with smaller membrane dis- 
placements than is the latter. Relative amplitudes of displacement very 
likely undergo nonsimple transformations in the neural conversion proc- 
ess. 

* There also is evidence that the transduction may be sensitive to spatial
derivatives of displacement as well as to displacement. This, too, could
facilitate perception of the pulse-rate mode.

Still open, too, is the question of the third pitch mode. Although our
models are limited to the frequency range below 1000 cps (because they
do not adequately account for middle-ear transmission above this
frequency), an explanation, fabricated of flimsy substance, can be suggested
for the third mode. Bekesy's data suggest that the amplitude of maximal 
displacement of the membrane falls appreciably (about 12 db/octave 
or more) for frequencies above 1000 cps. In this region, then, that part 
of the membrane responding to the lowest-frequency component would 
exceed in amplitude those parts responding to higher-frequency com- 
ponents. If amplitude of displacement is at all important in the conver- 
sion process (and it most probably is), then the third mode is favored 
provided the lowest-frequency component is not too high in harmonic
number. As indicated earlier, the third mode has been observed when 
either the fundamental, or the fundamental and second harmonic, is 
rejected from the stimulus. This mode has obtained in our pitch-match- 
ing experiments for fundamentals in the frequency range around 1000 
cps and slightly higher. 

One final comment is of interest along these same lines. It has been 
reported in the literature that if a periodic train of positive pulses is 
high-pass filtered at around 3000 and 4000 cps, one hears a "residue" 
pitch equal to the fundamental frequency. Our models suggest, how- 
ever, a response more nearly correlated with pulse rate. If one uses a 
stimulus in which pulse rate and fundamental frequency are confounded 
(as with positive pulses), then the former result might obtain. If, on 
the other hand, a stimulus such as alternate positive and negative pulses 
were used, the subjective impression may well be that of pulse rate. If 
the latter is in fact the case, then a fundamental "residue" pitch does 
not exist for this condition.* 
* Since drafting this paper, I have set up the latter experiment and listened to
alternate positive and negative pulses HP-filtered at 3000 and 4000 cps. I made
pitch matches fairly consistently at the pulse rate. A second listener, on the other
hand, made matches that were generally higher than the pulse rate, suggesting
that my preconceived notions may have influenced my data. It is unequivocal,
however, that one would not match to the fundamental frequency.

APPENDIX

Inverse Transforms for $F_1(s)$, $F_2(s)$ and $F_3(s)$

When the function $F_1(s)$ of (7) is disencumbered of its constants, the
problem of inverse transformation amounts to calculating the inverse
transform of:

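$$ K_1(s) = \left(\frac{s+\varepsilon}{s+\gamma}\right)\left[\frac{1}{(s+\alpha)^2+\beta^2}\right]^2 $$
$$ \qquad = \left[\frac{1}{(s+\alpha)^2+\beta^2}\right]^2 + \left(\frac{\varepsilon-\gamma}{s+\gamma}\right)\left[\frac{1}{(s+\alpha)^2+\beta^2}\right]^2 \qquad (31) $$
$$ \qquad = K_a(s) + K_b(s). $$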

The inverses of $K_a(s)$ and $K_b(s)$ can be obtained in the usual manner
by making partial fraction expansions in terms of the singularities,
account being taken of the order of the poles, and evaluating the residues
in each pole. Or, having got the inverse for $K_a(s)$, the inverse for $K_b(s)$
can be computed from:

$$ k_b(t) = [(\varepsilon - \gamma)e^{-\gamma t}] * [\mathcal{L}^{-1}K_a(s)], \qquad (32) $$

where * indicates convolution.

For the present case these standard procedures prove rather cumbersome
and messy. Because of the favorable initial values of the function
and its first two derivatives [namely, $k_1(0+) = k_1'(0+) = k_1''(0+) = 0$],
derivative relationships can be used to obviate evaluating residues and
performing the convolution.* The derivative relations of use here are
the following: If the function $f(t)$ has the Laplace transform $F(s)$, then

$$ (-t)^n f(t) = \mathcal{L}^{-1}\left[\frac{d^n F(s)}{ds^n}\right], \qquad (33) $$

and

$$ \frac{d^n f(t)}{dt^n} = \mathcal{L}^{-1}\left[s^n F(s) - s^{n-1}f(0+) - s^{n-2}f'(0+) - \cdots - f^{(n-1)}(0+)\right]. \qquad (34) $$

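We start with two well-known transform pairs:

$$ \mathcal{L}^{-1}\left[\frac{1}{(s+\alpha)^2+\beta^2}\right] = \frac{1}{\beta}\,e^{-\alpha t}\sin\beta t = h_1(t), \qquad (35) $$

and

$$ \mathcal{L}^{-1}\left[\frac{(s+\alpha)}{(s+\alpha)^2+\beta^2}\right] = e^{-\alpha t}\cos\beta t = h_2(t). \qquad (36) $$

Applying (33) to (36) gives

$$ \mathcal{L}^{-1}\left[\frac{(s+\alpha)^2-\beta^2}{[(s+\alpha)^2+\beta^2]^2}\right] = t\,e^{-\alpha t}\cos\beta t = t\,h_2(t). \qquad (37) $$
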

* I am indebted to B. F. Logan of the Acoustics Research Department of Bell 
Telephone Laboratories, who pointed out to me the utility of the derivative rela- 
tionships in obtaining transforms for these functions. 
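One notices that $K_a(s)$ can be expressed as a simple combination of (35)
and (37), namely,

$$ \frac{1}{[(s+\alpha)^2+\beta^2]^2} = \frac{1}{2\beta^2}\left\{\frac{1}{(s+\alpha)^2+\beta^2} - \frac{(s+\alpha)^2-\beta^2}{[(s+\alpha)^2+\beta^2]^2}\right\} \qquad (38) $$

and

$$ \mathcal{L}^{-1}\left\{\frac{1}{[(s+\alpha)^2+\beta^2]^2}\right\} = \frac{1}{2\beta^2}\left[h_1(t) - t\,h_2(t)\right]. \qquad (39) $$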
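
The pair for $K_a(s)$ can be checked numerically. The sketch below is illustrative only: it assumes the inverse $k_a(t) = e^{-\alpha t}(\sin\beta t - \beta t\cos\beta t)/2\beta^3$, which is the combination $[h_1(t) - t\,h_2(t)]/2\beta^2$ written out, and it picks arbitrary values of $\alpha$, $\beta$ and $s$ that do not come from the models.

```python
import math


def k_a(t, alpha, beta):
    """Assumed time function: exp(-alpha t)(sin bt - bt cos bt) / (2 b^3)."""
    return (math.exp(-alpha * t)
            * (math.sin(beta * t) - beta * t * math.cos(beta * t))
            / (2.0 * beta ** 3))


def K_a(s, alpha, beta):
    """Transform-domain function 1 / [(s + alpha)^2 + beta^2]^2."""
    return 1.0 / ((s + alpha) ** 2 + beta ** 2) ** 2


def laplace_numeric(func, s, t_max=40.0, steps=40000):
    """Laplace integral of func over [0, t_max] by composite Simpson's rule.

    steps must be even; t_max must be large enough for the integrand to
    have decayed to a negligible value.
    """
    h = t_max / steps
    total = func(0.0) + func(t_max) * math.exp(-s * t_max)
    for i in range(1, steps):
        t = i * h
        weight = 4.0 if i % 2 else 2.0
        total += weight * func(t) * math.exp(-s * t)
    return total * h / 3.0


if __name__ == "__main__":
    alpha, beta, s = 1.0, 2.0, 0.5   # arbitrary illustrative values
    numeric = laplace_numeric(lambda t: k_a(t, alpha, beta), s)
    print("numerical transform:", numeric)
    print("closed form:        ", K_a(s, alpha, beta))
```

Agreement of the two printed values for several choices of $s$ is a quick guard against sign or factor errors in the pair.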
Application of (34) through (39) gives 

and 

The inverse of $K_a(s)$ is, therefore, (39). The inverse of $K_b(s)$ can be
obtained from a partial fraction expansion followed by application of
(39), (40) and (41). Expand $K_b(s)$ as:

$$ \frac{(\varepsilon-\gamma)}{[(s+\alpha)^2+\beta^2]^2\,(s+\gamma)} = \frac{A}{(s+\gamma)} + \frac{G(s)}{[(s+\alpha)^2+\beta^2]^2}, \qquad (42) $$


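where $A$ is a constant and

$$ G(s) = (a_0 + a_1 s + a_2 s^2 + a_3 s^3). $$

If $A$ and $G(s)$ are evaluated, one gets

$$ A = \left.\frac{(\varepsilon-\gamma)}{[(s+\alpha)^2+\beta^2]^2}\right|_{s=-\gamma} = \frac{(\varepsilon-\gamma)}{[\gamma^2 - 2\alpha\gamma + \alpha^2 + \beta^2]^2}, $$
$$ a_0 = -A[4\alpha(\alpha^2+\beta^2) + \gamma^2(4\alpha-\gamma) - 2\gamma(3\alpha^2+\beta^2)], $$
$$ a_1 = A[\gamma(4\alpha-\gamma) - 2(3\alpha^2+\beta^2)], \qquad (43) $$
$$ a_2 = -A(4\alpha-\gamma), $$
$$ a_3 = -A. $$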

The inverse transform of $K_b(s)$, therefore, is a summation of terms (39)
through (41), with the appropriate multiplicative constants.
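
The partial fraction expansion can be verified numerically before the inverse is assembled. The sketch below is an illustration with arbitrary parameter values; the coefficient $a_0$ is obtained here by matching the first-power terms of the expansion, which is an assumption of this sketch, and the function names are hypothetical.

```python
def expansion_constants(alpha, beta, gamma, eps):
    """Constants A, a0..a3 of the expansion of K_b(s).

    a0 is derived in this sketch from the coefficient of s in
    A*D(s)^2 + G(s)*(s + gamma) = eps - gamma, with D(s) = (s + alpha)^2 + beta^2.
    """
    A = (eps - gamma) / (gamma ** 2 - 2.0 * alpha * gamma
                         + alpha ** 2 + beta ** 2) ** 2
    a3 = -A
    a2 = -A * (4.0 * alpha - gamma)
    a1 = A * (gamma * (4.0 * alpha - gamma) - 2.0 * (3.0 * alpha ** 2 + beta ** 2))
    a0 = -4.0 * alpha * (alpha ** 2 + beta ** 2) * A - gamma * a1
    return A, a0, a1, a2, a3


def K_b(s, alpha, beta, gamma, eps):
    """Unexpanded form of K_b(s)."""
    return (eps - gamma) / ((s + gamma) * ((s + alpha) ** 2 + beta ** 2) ** 2)


def K_b_expanded(s, alpha, beta, gamma, eps):
    """Expanded form A/(s + gamma) + G(s)/D(s)^2."""
    A, a0, a1, a2, a3 = expansion_constants(alpha, beta, gamma, eps)
    G = a0 + a1 * s + a2 * s ** 2 + a3 * s ** 3
    return A / (s + gamma) + G / ((s + alpha) ** 2 + beta ** 2) ** 2


if __name__ == "__main__":
    params = (0.5, 2.0, 1.5, 0.1)   # alpha, beta, gamma, eps: arbitrary values
    for s in (0.0, 0.7, 3.0):
        print(s, K_b(s, *params), K_b_expanded(s, *params))
```

The two columns agree at every test point when the constants are consistent with the expansion.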
Two differentiations (with respect to s) of (35) give the transform
pair:

$$ \mathcal{L}^{-1}\left\{\frac{(s+\alpha)^2 - \beta^2/3}{[(s+\alpha)^2+\beta^2]^3}\right\} = \frac{1}{6\beta}\,t^2 e^{-\alpha t}\sin\beta t, \qquad (44) $$

which is the function used as the model $F_3(s)$ of (13).
In an essentially parallel manner, one obtains the pair: 

Ks + Jy- + gp ~" 8^ f/ ' l( ^ ¥ + ^ " 3a) + /i2(8a< ~ ^ ¥)] - (45) 

This is the function used as the model $F_2(s)$ of (11).



